OPTIMAL INVERSE LITTLEWOOD-OFFORD THEOREMS 

HOI NGUYEN AND VAN VU 



Abstract. Let $\eta_i$, $i = 1, \dots, n$ be iid Bernoulli random variables, taking values $\pm 1$ with probability $1/2$. Given a multiset $V$ of $n$ integers $v_1, \dots, v_n$, we define the concentration probability as

$$\rho(V) := \sup_x \mathbf{P}(v_1\eta_1 + \dots + v_n\eta_n = x).$$

A classical result of Littlewood-Offord and Erdős from the 1940s asserts that, if the $v_i$ are non-zero, then $\rho(V)$ is $O(n^{-1/2})$. Since then, many researchers have obtained improved bounds by assuming various extra restrictions on $V$.

About 5 years ago, motivated by problems concerning random matrices, Tao and Vu introduced the Inverse Littlewood-Offord problem. In the inverse problem, one would like to characterize the set $V$, given that $\rho(V)$ is relatively large.

In this paper, we introduce a new method to attack the inverse problem. As an application, we strengthen the previous result of Tao and Vu, obtaining an optimal characterization for $V$. This immediately implies several classical theorems, such as those of Sárközy-Szemerédi and Halász.

The method also applies to the continuous setting and leads to a simple proof for the $\beta$-net theorem of Tao and Vu, which plays a key role in their recent studies of random matrices.

All results extend to the general case when $V$ is a subset of an abelian torsion-free group, and the $\eta_i$ are independent variables satisfying some weak conditions.



1. Introduction 

1.1. The Forward Littlewood-Offord problem. Let $\eta_i$, $i = 1, \dots, n$ be iid Bernoulli random variables, taking values $\pm 1$ with probability $1/2$. Given a multiset $V$ of $n$ integers $v_1, \dots, v_n$, we define the random walk $S$ with steps in $V$ to be the random variable $S := \sum_{i=1}^n \eta_i v_i$. The concentration probability is defined to be

$$\rho(V) := \sup_x \mathbf{P}(S = x).$$

Motivated by their study of random polynomials in the 1940s, Littlewood and Offord [7] raised the question of bounding $\rho(V)$. (We call this the forward Littlewood-Offord problem, in contrast with the inverse Littlewood-Offord problem discussed in the next section.) They showed that $\rho(V) = O(n^{-1/2}\log n)$. Shortly after the Littlewood-Offord paper, Erdős [1] gave a beautiful combinatorial proof of the refinement

$$\rho(V) \le \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n} = O(n^{-1/2}). \qquad (1)$$

Erdős' result is sharp, as demonstrated by $V = \{1, \dots, 1\}$.
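As an illustration (not part of the original text), the sharpness claim can be checked by exact counting for small $n$. The sketch below computes $\rho(V)$ by tracking how many sign patterns reach each value of the sum; the function name `concentration_probability` is our own.

```python
from collections import Counter

def concentration_probability(steps):
    """Return (max number of sign patterns hitting one value, 2**n).

    rho(V) is the ratio of the two returned integers.
    """
    counts = Counter({0: 1})            # counts[s] = number of sign patterns with sum s
    for v in steps:
        nxt = Counter()
        for s, c in counts.items():
            nxt[s + v] += c             # eta = +1
            nxt[s - v] += c             # eta = -1
        counts = nxt
    return max(counts.values()), 2 ** len(steps)

n = 10
best, total = concentration_probability([1] * n)
print(best, total)                      # numerator and denominator of rho({1, ..., 1})
```

For $V = \{1, \dots, 1\}$ the numerator equals the central binomial coefficient, so the Erdős bound is attained; for any other multiset of non-zero integers the numerator cannot exceed it.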

Notation. Here and later, asymptotic notations, such as $O$, $\Omega$, $\Theta$, and so forth, are used under the assumption that $n \to \infty$. A notation such as $O_C(\cdot)$ emphasizes that the hidden constant in $O$ depends on $C$. If $a = \Omega(b)$, we write $b \ll a$ or $a \gg b$. All logarithms have a natural base, if not specified otherwise.

The results of Littlewood-Offord and Erdős are classics in combinatorics and have generated an impressive wave of research, particularly from the early 1960s to the late 1980s.

One direction of research was to generalize Erdős' result to other groups. For example, in 1966 and 1970, Kleitman extended Erdős' result to complex numbers and normed vectors,
respectively. Several results in this direction can be found in [31IH]. 

Another direction was motivated by the observation that (1) can be improved significantly by making additional assumptions about $V$. The first such result was discovered by Erdős and Moser [2], who showed that if the $v_i$ are distinct, then $\rho(V) = O(n^{-3/2}\log n)$. They conjectured that the logarithmic term is not necessary, and this was confirmed by Sárközy and Szemerédi.

Theorem 1.2. Let $V$ be a set of $n$ different integers. Then

$$\rho(V) = O(n^{-3/2}).$$

In [3] (see also [23]), Halász proved very general theorems that imply Theorem 1.2 and many others. One of his results can be formulated as follows.

Theorem 1.3. Let $l$ be a fixed integer and $R_l$ be the number of solutions of the equation $v_{i_1} + \dots + v_{i_l} = v_{j_1} + \dots + v_{j_l}$. Then

$$\rho(V) = O(n^{-2l-1/2} R_l).$$

It is easy to see, by setting $l = 1$, that Theorem 1.3 implies Theorem 1.2: when the $v_i$ are distinct, $R_1 = n$, so the bound becomes $O(n^{-5/2} \cdot n) = O(n^{-3/2})$.
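To make the quantity $R_l$ concrete, here is a small brute-force sketch (our own illustration, with the hypothetical helper name `halasz_count`). It counts ordered index tuples, so repeated elements of a multiset are counted with multiplicity.

```python
from collections import Counter
from itertools import product

def halasz_count(V, l):
    """Number of index tuples (i_1..i_l, j_1..j_l) with
    v_{i_1} + ... + v_{i_l} = v_{j_1} + ... + v_{j_l}."""
    # Group the n**l ordered l-tuples by their sum; tuples with equal sums pair up.
    sums = Counter(sum(t) for t in product(V, repeat=l))
    return sum(c * c for c in sums.values())

print(halasz_count([3, 1, 4, 15, 9], 1))   # distinct elements: R_1 = n
print(halasz_count([1, 1, 1, 1], 1))       # all equal: R_1 = n**2
```

The two extreme cases show why distinctness matters: $R_1$ ranges from $n$ (distinct elements) to $n^2$ (all elements equal), which moves the bound of Theorem 1.3 from $n^{-3/2}$ back to $n^{-1/2}$.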

Another famous result in this area is that of Stanley [13], which, solving a conjecture of Erdős and Moser, shows when $\rho(V)$ attains its maximum under the assumption that the $v_i$ are different.

Theorem 1.4. Let $n$ be odd and $V_0 := \{-\lfloor n/2 \rfloor, \dots, \lfloor n/2 \rfloor\}$. Then

$$\rho(V) \le \rho(V_0).$$

A similar result holds for the case of $n$ being even [13]. Stanley's proof of Theorem 1.4 used sophisticated machinery from algebraic geometry, particularly the hard Lefschetz theorem. A few years later, a more elementary proof was given by Proctor [9]. This proof also has an algebraic nature, involving the representation of the Lie algebra $\mathfrak{sl}(2, \mathbf{C})$. As far as we know, there is no purely combinatorial proof.

It is natural to ask for the actual value of $\rho(V_0)$. From Theorem 1.2, one would guess (under the assumption that the elements of $V$ are different) that

$$\rho(V_0) = (C_0 + o(1)) n^{-3/2}$$

for some constant $C_0 > 0$. However, the algebraic proofs do not give the value of $C_0$. In fact, it is not obvious that $\lim_{n \to \infty} n^{3/2} \rho(V_0)$ exists.

Assuming for a moment that $C_0$ exists, one would next wonder if $V_0$ is a stable maximizer. In other words, if some other set $V_0'$ has $\rho(V_0')$ close to $C_0 n^{-3/2}$, then should $V_0'$ (possibly after a normalization) be "close" to $V_0$? (Note that $\rho$ is invariant under dilation, so a normalization would be necessary.)



1.5. The inverse Littlewood-Offord problem. Motivated by inverse theorems from additive combinatorics (see [23, Chapter 5]) and a variant for random sums in [20, Theorem 5.2], Tao and the second author [18] brought a different view to the problem. Instead of trying to improve the bound further by imposing new assumptions (as done in the forward problems), they tried to provide the complete picture by finding the underlying reason as to why the concentration probability is large (say, polynomial in $n$).

Note that the (multi)-set $V$ has $2^n$ subsums, and $\rho(V) \ge n^{-C}$ means that at least $2^n / n^C$ of these take the same value. This observation suggests that the set should have a very strong additive structure. To determine this structure, we first discuss a few examples of $V$ where $\rho(V)$ is large. For a set $A$, we denote the set $\{a_1 + \dots + a_l \mid a_i \in A\}$ by $lA$.

Example 1.6. Let $I = [-N, N]$ and $v_1, \dots, v_n$ be elements of $I$. Because $S \in nI$, by the pigeon-hole principle, $\rho(V) \ge \frac{1}{|nI|} = \Omega(\frac{1}{nN})$. In fact, a short consideration yields a better bound. Note that, with a probability of at least $.99$, we have $S \in 10\sqrt{n} I$. Thus, again by the pigeon-hole principle, we have $\rho(V) = \Omega(\frac{1}{\sqrt{n} N})$. If we set $N = n^{C - 1/2}$ for some constant $C \ge 1/2$, then

$$\rho(V) = \Omega\left(\frac{1}{n^C}\right). \qquad (2)$$
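The first pigeon-hole bound of the example is easy to confirm numerically. The following sketch is an illustration of ours: it computes $\rho(V)$ exactly for an arbitrary choice of elements of $[-N, N]$ and compares it with $1/|nI| = 1/(2nN + 1)$. The specific list `V` is a made-up example.

```python
from collections import Counter
from fractions import Fraction

def rho(steps):
    """Exact concentration probability of the +-1 random walk with the given steps."""
    counts = Counter({0: 1})
    for v in steps:
        nxt = Counter()
        for s, c in counts.items():
            nxt[s + v] += c
            nxt[s - v] += c
        counts = nxt
    return Fraction(max(counts.values()), 2 ** len(steps))

N = 5
V = [3, -5, 2, 2, 4, -1, 5, 1, 3, -4, 2, 5]    # arbitrary elements of [-N, N]
n = len(V)
pigeonhole_bound = Fraction(1, 2 * n * N + 1)   # S takes at most |nI| = 2nN + 1 values
print(float(rho(V)), float(pigeonhole_bound))
```

The exact value is typically much larger than the crude bound, consistent with the improved estimate $\Omega(1/(\sqrt{n}N))$ obtained from the concentration of $S$.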

The next, and more general, construction comes from additive combinatorics. A very important concept in this area is that of generalized arithmetic progressions (GAPs). A set $Q$ is a GAP of rank $r$ if it can be expressed in the form

$$Q = \{a_0 + x_1 a_1 + \dots + x_r a_r \mid M_i \le x_i \le M_i' \text{ for all } 1 \le i \le r\}$$

for some $\{a_0, \dots, a_r\}$, $\{M_1, \dots, M_r\}$, and $\{M_1', \dots, M_r'\}$.

It is convenient to think of $Q$ as the image of an integer box $B := \{(x_1, \dots, x_r) \in \mathbf{Z}^r \mid M_i \le x_i \le M_i'\}$ under the linear map

$$\Phi : (x_1, \dots, x_r) \mapsto a_0 + x_1 a_1 + \dots + x_r a_r.$$

The numbers $a_i$ are the generators of $Q$, the numbers $M_i$ and $M_i'$ are the dimensions of $Q$, and $\mathrm{Vol}(Q) := |B|$ is the volume of $Q$. We say that $Q$ is proper if this map is one-to-one or, equivalently, if $|Q| = \mathrm{Vol}(Q)$. For non-proper GAPs, we, of course, have $|Q| < \mathrm{Vol}(Q)$. If $-M_i = M_i'$ for all $i \ge 1$ and $a_0 = 0$, we say that $Q$ is symmetric.
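The definitions of a GAP, its volume and properness translate directly into code. The sketch below is our own illustration for integer GAPs; the helper names `gap_elements` and `is_proper` are hypothetical.

```python
from itertools import product

def gap_elements(a0, generators, lows, highs):
    """Image of the integer box prod_i [lows[i], highs[i]] under
    (x_1, ..., x_r) -> a0 + x_1 a_1 + ... + x_r a_r, listed with repetition."""
    box = product(*(range(lo, hi + 1) for lo, hi in zip(lows, highs)))
    return [a0 + sum(x * a for x, a in zip(point, generators)) for point in box]

def is_proper(a0, generators, lows, highs):
    """A GAP is proper when the map from the box is one-to-one, i.e. |Q| = Vol(Q)."""
    elems = gap_elements(a0, generators, lows, highs)
    return len(set(elems)) == len(elems)

# Symmetric rank-2 GAPs with dimensions -3 <= x_i <= 3, hence volume 49.
print(is_proper(0, [1, 10], [-3, -3], [3, 3]))   # generators far apart
print(is_proper(0, [1, 2], [-3, -3], [3, 3]))    # 2*1 + 0*2 = 0*1 + 1*2 collides
```

The second example shows a non-proper GAP: its volume is still 49, but it has fewer than 49 distinct elements.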

Example 1.7. Let $Q$ be a proper symmetric GAP of rank $r$ and volume $N$. Let $v_1, \dots, v_n$ be (not necessarily distinct) elements of $Q$. The random variable $S = \sum_{i=1}^n \eta_i v_i$ takes values in the GAP $nQ$. Because $|nQ| \le \mathrm{Vol}(nB) = n^r N$, the pigeon-hole principle implies that $\rho(V) \ge \Omega(\frac{1}{n^r N})$. In fact, using the same idea as in the previous example, one can improve the bound to $\Omega(\frac{1}{n^{r/2} N})$. If we set $N = n^{C - r/2}$ for some constant $C \ge r/2$, then

$$\rho(V) = \Omega\left(\frac{1}{n^C}\right). \qquad (3)$$

The examples above show that, if the elements of $V$ belong to a proper GAP with a small rank and small cardinality, then $\rho(V)$ is large. A few years ago, Tao and the second author [18] showed that this is essentially the only reason:

Theorem 1.8 (Weak inverse theorem). [18] Let $C, \epsilon > 0$ be arbitrary constants. There are constants $r$ and $C'$ depending on $C$ and $\epsilon$ such that the following holds. Assume that $V = \{v_1, \dots, v_n\}$ is a multiset of integers satisfying $\rho(V) \ge n^{-C}$. Then, there is a proper symmetric GAP $Q$ with a rank of at most $r$ and a volume of at most $n^{C'}$ that contains all but at most $n^{1-\epsilon}$ elements of $V$ (counting multiplicity).

Remark 1.9. The presence of a small set of exceptional elements is not completely avoidable. For instance, one can add $o(\log n)$ completely arbitrary elements to $V$ and, at worst, only decrease $\rho(V)$ by a factor of $n^{-o(1)}$. Nonetheless, we expect the number of such elements to be less than what is given by the results here.

The reason we call Theorem 1.8 weak is that $C'$ is not optimal. In particular, it is far from reflecting the relations in (2) and (3). In a later paper [16], Tao and the second author refined the approach to obtain the following stronger result.

Theorem 1.10 (Strong inverse theorem). [16] Let $C$ and $1 > \epsilon$ be positive constants. Assume that

$$\rho(V) \ge n^{-C}.$$

Then, there exists a proper symmetric GAP $Q$ of rank $r = O_{C,\epsilon}(1)$ that contains all but $O_r(n^{1-\epsilon})$ elements of $V$ (counting multiplicity), where

$$|Q| = O_{C,\epsilon}(n^{C - r/2 + \epsilon}).$$

The bound on $|Q|$ matches Example 1.7 up to the $n^{\epsilon}$ term. However, this error term seems to be the limit of the approach. The proofs of Theorems 1.8 and 1.10 rely on a replacement argument and various lemmas about random walks and GAPs.

Let us now consider an application of Theorem 1.10. Note that Theorem 1.10 enables us to make very precise counting arguments. Assume that we would like to count the number of (multi)sets $V$ of integers with $\max |v_i| \le N = n^{O(1)}$ such that $\rho(V) \ge \rho := n^{-C}$.

Fix $d \ge 1$, and fix a GAP $Q$ with rank $r$ and volume $|Q| = n^{C - r/2}$. The dominating term in the calculation will be the number of multi-subsets of size $n$ of $Q$, which is at most $|Q|^n$.
Motivated by questions from random matrix theory, Tao and the second author obtained 
the following continuous analogue of this result. 

Definition 1.11 (Small ball probability). Let $z$ be a real random variable, and let $V = \{v_1, \dots, v_n\}$ be a multiset in $\mathbf{R}^d$. For any $r > 0$, we define the small ball probability as

$$\rho_{r,z}(V) := \sup_{x \in \mathbf{R}^d} \mathbf{P}(v_1 z_1 + \dots + v_n z_n \in B(x, r)),$$

where $z_1, \dots, z_n$ are iid copies of $z$, and $B(x, r)$ denotes the closed disk of radius $r$ centered at $x$ in $\mathbf{R}^d$.

Let $n$ be a positive integer and $\beta, \rho$ be positive numbers that may depend on $n$. Let $S_{n,\beta,\rho}$ be the collection of all multisets $V = \{v_1, \dots, v_n\}$, $v_i \in \mathbf{R}^2$, such that $\sum_{i=1}^n \|v_i\|_2^2 = 1$ and $\rho_{\beta,\eta}(V) \ge \rho$, where $\eta$ has a Bernoulli distribution.

Theorem 1.12 (The $\beta$-net Theorem). Let $0 < \epsilon < 1/3$ and $C > 0$ be constants. Then, for all sufficiently large $n$ and $\beta \ge \exp(-n^{\epsilon})$ and $\rho \ge n^{-C}$, there is a set $S \subset (\mathbf{R}^2)^n$ of size at most

$$\rho^{-n} n^{-n(1/2 - \epsilon)} + \exp(o(n))$$

such that for any $V = \{v_1, \dots, v_n\} \in S_{n,\beta,\rho}$, there is some $V' = (v_1', \dots, v_n') \in S$ such that $\|v_i - v_i'\|_2 \le \beta$ for all $i$.

The theorem looks a bit cleaner if we use $\mathbf{C}$ instead of $\mathbf{R}^2$ (as in [21]). However, we prefer the current form, because it is more suitable for generalization. The set $S$ is usually referred to as a $\beta$-net of $S_{n,\beta,\rho}$.

Theorem 1.12 is at the heart of establishing the Circular Law conjecture in random matrix theory (see [21, 17]). It also plays an important role in the study of the condition number of randomly perturbed matrices (see [22]). Its proof in [21] is quite technical and occupies the bulk of that paper.

¹ A more detailed version of Theorems 1.8 and 1.10 tells us that there are not too many ways to choose the generators of $Q$. In particular, if $N = n^{O(1)}$, the number of ways to fix these is negligible compared to the main term.

Given the above discussion, one might expect to obtain Theorem 1.12 as a simple corollary of a continuous analogue of Theorem 1.10. However, the arguments in [21] have not yet provided such an inverse theorem (although they did provide a sufficient amount of information about the set $S$ to make an estimate possible). The paper [10] by Rudelson and Vershynin also contains a characterization of the set $S$, but their characterization has a somewhat different spirit than those discussed in this paper.

2. A NEW APPROACH AND NEW RESULTS 

In this paper, we introduce a new approach to the inverse theorem. The core of this new 
approach is a (long-range) variant of Freiman's famous inverse theorem. 

This new approach seems powerful. First, it enables us to remove the error term $n^{\epsilon}$ in Theorem 1.10, resulting in an optimal inverse theorem.

Theorem 2.1 (Optimal inverse Littlewood-Offord theorem, discrete case). Let $\epsilon < 1$ and $C$ be positive constants. Assume that

$$\rho(V) \ge n^{-C}.$$

Then, there exists a proper symmetric GAP $Q$ of rank $r = O_{C,\epsilon}(1)$ that contains all but at most $\epsilon n$ elements of $V$ (counting multiplicity), where

$$|Q| = O_{C,\epsilon}(\rho(V)^{-1} n^{-r/2}).$$

This immediately implies several forward theorems, such as Theorems 1.2 and 1.3. For example, we can prove Theorem 1.2 as follows.

Proof (of Theorem 1.2). Assume, for contradiction, that there is a set $V$ of $n$ distinct numbers such that $\rho(V) \ge c_1 n^{-3/2}$ for some large constant $c_1$ to be chosen. Set $\epsilon = .1$, $C = 3/2$. By Theorem 2.1, there is a GAP $Q$ of rank $r$ and size $O_{C,\epsilon}(\frac{1}{c_1} n^{C - r/2})$ that contains at least $.9n$ elements from $V$. This implies $|Q| \ge .9n$. By setting $c_1$ to be sufficiently large and using the fact that $C = 3/2$ and $r \ge 1$, we can guarantee that $|Q| \le .8n$, a contradiction. □

Theorem 1.3 can be proved in a similar manner with the details left as an exercise.

Similar to [18], our method and results can be extended (rather automatically) to much more general settings.

General $V$. Instead of taking $V$ to be a subset of $\mathbf{Z}$, we can take it to be a subset of any abelian torsion-free group $G$ (thanks to Freiman isomorphism, see Section 4). We can also replace $\mathbf{Z}$ by the finite field $\mathbf{F}_p$, where $p$ is any sufficiently large prime. (In fact, the first step in our proof is to embed $V$ into $\mathbf{F}_p$.)

General $\eta$. We can replace the Bernoulli random variables by independent random variables $\eta_i$ satisfying the following condition. There is a constant $c > 0$ and an infinite sequence of primes $p$ such that for any $p$ in the sequence, any (multi)-subset $V$ of size $n$ of $\mathbf{F}_p$ and any $t \in \mathbf{F}_p$,

$$\prod_{i=1}^n |\mathbf{E} e_p(\eta_i v_i t)| \le \exp\left(-c \sum_{i=1}^n \left\|\frac{v_i t}{p}\right\|^2\right), \qquad (5)$$

where $\|x\|$ denotes the distance from $x$ to the closest integer (we view the elements of $\mathbf{F}_p$ as integers between $0$ and $p - 1$) and $e_p(x) := \exp(2\pi\sqrt{-1}\, x/p)$.

Example 2.2. (Lazy random walks) Given a parameter $0 < \mu < 1$, let $\eta_i^{\mu}$ be iid copies of a random variable $\eta^{\mu}$, where $\eta^{\mu} = 1$ or $-1$ with probability $\mu/2$, and $\eta^{\mu} = 0$ with probability $1 - \mu$. The sum

$$S^{\mu}(V) := \sum_{i=1}^n \eta_i^{\mu} v_i$$

can be viewed as a lazy random walk with steps in $V$. A simple calculation shows

$$\mathbf{E} e_p(\eta^{\mu} x) = (1 - \mu) + \mu \cos\frac{2\pi x}{p}.$$

It is easy to show that there is a constant $c > 0$ depending on $\mu$ such that

$$\left|(1 - \mu) + \mu \cos\frac{2\pi x}{p}\right| \le \exp\left(-c \left\|\frac{x}{p}\right\|^2\right).$$
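The two displayed facts about the lazy walk can be sanity-checked numerically. In the sketch below (ours, not from the paper) the helper names are hypothetical, and the value $c = 2$ is only an illustrative choice that happens to work for $\mu = 1/2$, where the characteristic function equals $\cos^2(\pi x/p)$.

```python
import cmath
import math

def char_value(mu, x, p):
    """E e_p(eta x) for eta = +1, -1 with probability mu/2 each and 0 otherwise."""
    e = lambda t: cmath.exp(2j * math.pi * t / p)
    return (mu / 2) * e(x) + (mu / 2) * e(-x) + (1 - mu)

def dist_to_nearest_integer(t):
    return abs(t - round(t))

def check_lazy_walk(p, mu, c):
    """Check the closed form and the exponential bound for every x in F_p."""
    for x in range(p):
        closed = (1 - mu) + mu * math.cos(2 * math.pi * x / p)
        if abs(char_value(mu, x, p) - closed) > 1e-12:
            return False
        if abs(closed) > math.exp(-c * dist_to_nearest_integer(x / p) ** 2) + 1e-12:
            return False
    return True

print(check_lazy_walk(p=101, mu=0.5, c=2.0))
```

The bound cannot hold with $\mu = 1$ and the same normalization, since $|\cos(2\pi x/p)|$ is close to $1$ when $x/p$ is close to $1/2$; this is why the Bernoulli case is handled through a change of variable in the proof.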

Example 2.3. ($\mu$-bounded variables) It suffices to assume that there is some constant $0 < \mu \le 1$ such that for all $i$

$$|\mathbf{E} e_p(\eta_i x)| \le (1 - \mu) + \mu \cos\frac{2\pi x}{p}. \qquad (6)$$

Theorem 2.4. The conclusion of Theorem 2.1 holds for the case when $V$ is a multi-subset of an arbitrary torsion-free abelian group $G$ and $\eta_i$, $1 \le i \le n$, are independent random variables satisfying (5).

In some applications, we might need a version of Theorem 2.1 with a smaller number of exceptional elements. By slightly modifying the proof presented in Section 5, we can prove the following result.

Theorem 2.5. Let $\epsilon < 1$ and $C$ be positive constants. Assume that

$$\rho(V) \ge n^{-C}.$$

Then, for any $n^{\epsilon} \le n' \le n$, there exists a proper symmetric GAP $Q$ of rank $r = O_{\epsilon,C}(1)$ that contains all but $n'$ elements of $V$ (counting multiplicity), where

$$|Q| = O_{C,\epsilon}(\rho^{-1} / n'^{\,r/2}).$$






Remark 2.6. In an upcoming paper [8], we are able to address the unresolved issues concerning Theorem 1.4 by following the method used to prove Theorem 2.1. We prove that $\rho(V_0) = (\sqrt{24/\pi} + o(1)) n^{-3/2}$. More importantly, we obtain a stable version of Theorem 1.4, which shows that, if $\rho(V)$ is close to $(\sqrt{24/\pi} + o(1)) n^{-3/2}$, then $V$ is "close" to $V_0$. As a byproduct, we obtain the first non-algebraic proof for the asymptotic version of the Stanley theorem.
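The constant $\sqrt{24/\pi}$ can be observed numerically. The sketch below is our illustration and not the method of [8]: it computes $\rho(V_0)$ exactly for odd $n$ by counting sign patterns and rescales by $n^{3/2}$. The function name is hypothetical.

```python
import math

def rho_extremal(n):
    """rho(V_0) for V_0 = {-floor(n/2), ..., floor(n/2)} with n odd, by exact counting."""
    half = n // 2
    steps = list(range(-half, half + 1))
    span = sum(abs(v) for v in steps)        # |S| never exceeds this value
    counts = [0] * (2 * span + 1)            # counts[s + span] = sign patterns with sum s
    counts[span] = 1
    for v in steps:
        nxt = [0] * (2 * span + 1)
        for idx, c in enumerate(counts):
            if c:
                nxt[idx + v] += c            # eta = +1
                nxt[idx - v] += c            # eta = -1
        counts = nxt
    return max(counts) / 2 ** n

n = 101
print(rho_extremal(n) * n ** 1.5, math.sqrt(24 / math.pi))
```

Heuristically, the constant is what a local limit theorem predicts: $S$ has variance $\sum v_i^2 \approx n^3/12$ and lives on the even integers, which gives a maximal point mass of about $2/\sqrt{2\pi n^3/12} = \sqrt{24/\pi}\, n^{-3/2}$.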

We now turn to the continuous setting. In this part, we consider a real random variable $z$ such that there exists a constant $C_z$ such that

$$\mathbf{P}(1 \le |z_1 - z_2| \le C_z) \ge 1/2, \qquad (7)$$

where $z_1, z_2$ are iid copies of $z$. We note that Bernoulli random variables are clearly of this type. (Also, the interested reader may find (7) more general than the condition of the $\kappa$-controlled second moment defined in [21] and the condition of bounded third moment in [10].) In the statement above, $C_z$ is not uniquely defined. In what follows, we will take the smallest value of $C_z$.

We say that a vector $v \in \mathbf{R}^d$ is $\delta$-close to a set $Q \subset \mathbf{R}^d$ if there exists a vector $q \in Q$ such that $\|v - q\|_2 \le \delta$. A set $X$ is $\delta$-close to a set $Q$ if every element of $X$ is $\delta$-close to $Q$. The analogue of Example 1.7 is the following.

Example 2.7. Let $Q$ be a proper symmetric GAP of rank $r$ and volume $N$ in $\mathbf{R}^d$. Let $v_1, \dots, v_n$ be (not necessarily distinct) vectors that are $O(\beta n^{-1/2})$-close to $Q$. If we set $|Q| = n^{C - r/2}$ for some constant $C \ge r/2$, then

$$\rho_{\beta,z}(V) = \Omega\left(\frac{1}{n^C}\right). \qquad (8)$$

Thus, one would expect that, if $\rho_{\beta,z}(V)$ is large, then (most of) $V$ is $O(\beta n^{-1/2})$-close to a GAP with a small volume. Confirming this intuition, we obtain the following continuous analogue of Theorem 2.1.

Theorem 2.8 (Optimal inverse Littlewood-Offord theorem, continuous case). Let $\delta, C > 0$ be arbitrary constants and $\beta > 0$ be a parameter that may depend on $n$. Suppose that $V = \{v_1, \dots, v_n\}$ is a (multi-)subset of $\mathbf{R}^d$ such that $\sum_{i=1}^n \|v_i\|_2^2 = 1$ and that $V$ has large small ball probability

$$\rho := \rho_{\beta,z}(V) \ge n^{-C},$$

where $z$ is a real random variable satisfying (7). Then, there exists a proper symmetric GAP $Q$ of rank $d \le r = O(1)$ so that all but at most $\delta n$ elements of $V$ (counting multiplicity) are $O(\beta \frac{\log n}{\sqrt{n}})$-close to $Q$, where

$$|Q| = O(\rho^{-1} \delta^{(-r+d)/2} n^{(-r+d)/2}).$$

The theorem is optimal in the sense that the exponent $(-r + d)/2$ of $n$ cannot generally be improved (see Appendix B for more details).

Theorem 2.8 is a special case of the following more general theorem.

Theorem 2.9 (Continuous Inverse Littlewood-Offord theorem, general setting). Let $0 < \epsilon < 1$ and $0 < C$ be constants. Let $\beta > 0$ be a parameter that may depend on $n$. Suppose that $V = \{v_1, \dots, v_n\}$ is a (multi-)subset of $\mathbf{R}^d$ such that $\sum_{i=1}^n \|v_i\|_2^2 = 1$ and that $V$ has large small ball probability

$$\rho := \rho_{\beta,z}(V) \ge n^{-C},$$

where $z$ is a real random variable satisfying (7). Then, the following holds. For any number $n^{\epsilon} \le n' \le n$, there exists a proper symmetric GAP $Q = \{\sum_{i=1}^r x_i g_i : |x_i| \le L_i\}$ such that

• (Full dimension) There exists $\sqrt{n'/\log n} \ll k \ll \sqrt{n'}$ such that the dilate $P := \beta^{-1} k \cdot Q$ contains the discrete hypercube $\{0, 1\}^d$.

• (Approximation) At least $n - n'$ elements of $V$ are $O(\beta/k)$-close to $Q$.

• (Small rank and cardinality) $Q$ has constant rank $d \le r = O(1)$, and cardinality

$$|Q| = O(\rho^{-1} n'^{\,(-r+d)/2}).$$

• (Small generators) There is a non-zero integer $p = O(\sqrt{n'})$ such that all steps $g_i$ of $Q$ have the form $g_i = (g_{i1}, \dots, g_{id})$, where $g_{ij} = \beta \frac{p_{ij}}{p}$ with $p_{ij} \in \mathbf{Z}$ and $p_{ij} = O(\beta^{-1}\sqrt{n'})$.

Theorem 2.9 implies the following corollary (see Appendix B for a simple proof), from which one can derive Theorem 1.12 in a straightforward manner (similar to the discrete case discussed earlier).

Corollary 2.10. Let $0 < \epsilon < 1$ and $0 < C$ be constants. Let $\beta > 0$ be a parameter that may depend on $n$. Suppose that $V = \{v_1, \dots, v_n\}$ is a (multi-)subset of $\mathbf{R}^d$ such that $\sum_{i=1}^n \|v_i\|_2^2 = 1$ and that $V$ has large small ball probability

$$\rho := \rho_{\beta,z}(V) \ge n^{-C},$$

where $z$ is a real random variable satisfying (7). Then the following holds. For any number $n'$ between $n^{\epsilon}$ and $n$, there exists a proper symmetric GAP $Q = \{\sum_{i=1}^r x_i g_i : |x_i| \le L_i\}$ such that

• At least $n - n'$ elements of $V$ are $\beta$-close to $Q$.

• $Q$ has small rank, $r = O(1)$, and small cardinality

$$|Q| \le \max\left(O\left(\frac{\rho^{-1}}{\sqrt{n'}}\right), 1\right).$$

• There is a non-zero integer $p = O(\sqrt{n'})$ such that all steps $g_i$ of $Q$ have the form $g_i = (g_{i1}, \dots, g_{id})$, where $g_{ij} = \beta \frac{p_{ij}}{p}$ with $p_{ij} \in \mathbf{Z}$ and $p_{ij} = O(\beta^{-1}\sqrt{n'})$.



Note that the approximations obtained from Corollary 2.10 are rougher than those from Theorem 2.9. However, the bound on $|Q|$ is improved in some critical cases (particularly when $r = d$).






In the above theorems, the hidden constants could depend on previously set constants $\epsilon, C, C_z, d$. We could have written $O_{\epsilon,C,C_z,d}$ and $\ll_{\epsilon,C,C_z,d}$ everywhere, but these notations are somewhat cumbersome, and this dependence is not our focus.

Proof (of Theorem 1.12). Set $n' := n^{1 - 3\epsilon/2}$ (which is $\gg n^{\epsilon}$ as $\epsilon < 1/3$). Let $S'$ be the collection of all subsets of size at least $n - n'$ of GAPs whose parameters satisfy the conclusion of Corollary 2.10.

Because each GAP is determined by its generators and dimensions, the number of such GAPs is bounded by $((\beta^{-1}\sqrt{n'})\sqrt{n'})^{O(1)} (\frac{\rho^{-1}}{\sqrt{n'}})^{O(1)} = \exp(o(n))$. (The term $(\frac{\rho^{-1}}{\sqrt{n'}})^{O(1)}$ bounds the number of choices of the dimensions $M_i$.) Thus, $|S'| = O(\frac{\rho^{-1}}{\sqrt{n'}})^{n} + \exp(o(n))$.

We approximate each of the exceptional elements by a lattice point in $\beta \cdot (\mathbf{Z}/d)^d$. Thus, if we let $S''$ be the set of these approximated tuples, then $|S''| \le \sum_{i \le n'} (O(\beta^{-1}))^{i} = \exp(o(n))$ (here, we used the assumption $\beta \ge \exp(-n^{\epsilon})$).

Set $S := S' \times S''$. It is easy to see that $|S| \le O(n^{-1/2 + \epsilon} \rho^{-1})^{n} + \exp(o(n))$. Furthermore, if $\rho(V) \ge n^{-O(1)}$, then $V$ is $\beta$-close to an element of $S$, concluding the proof. □

3. The long range inverse theorem

Let us first recall a famous theorem by Freiman [23, Chapter 5].

Theorem 3.1 (Freiman's inverse theorem). Let $\gamma$ be a positive constant and $X$ a subset of a torsion-free group such that $|2X| \le \gamma |X|$. Then, there is a proper symmetric GAP $Q$ of rank at most $r = O_{\gamma}(1)$ and cardinality $O_{\gamma}(|X|)$ such that $X \subset Q$.

In our analysis, we will need to deal with an assumption of the form $|kX| \le k^{\gamma} |X|$, where $\gamma$ is a constant but $k$ is not. (Typically, $k$ will be a positive power of $|X|$.) We give a structure for $X$ under this condition in the following theorem, which we will call the long range inverse theorem.

Theorem 3.2 (Long range inverse theorem). Let $\gamma > 0$ be constant. Assume that $X$ is a subset of a torsion-free group such that $0 \in X$ and $|kX| \le k^{\gamma} |X|$ for some integer $k \ge 2$ that may depend on $|X|$. Then, there is a proper symmetric GAP $Q$ of rank $r = O(\gamma)$ and cardinality $O_{\gamma}(k^{-r} |kX|)$ such that $X \subset Q$.

Note that for any given $\epsilon > 0$ and for any sufficiently large $k$, it is implied from Theorem 3.2 that the rank of $Q$ is at most $\gamma + \epsilon$. The implicit constant involved in the size of $Q$ can be taken to be $2^{2^{O(\gamma)}}$, which is quite poor. Although we have not elaborated on this bound substantially, our method does not seem to say anything when the polynomial growth of the size of $kX$ is replaced by something faster.

Theorem 3.2 will serve as our main technical tool. This theorem can be proved by applying an earlier result [19]. We give a short deduction in Appendix A.

4. Freiman isomorphism

We now introduce the concept of Freiman isomorphism that allows us to transfer an additive problem to another group in a way that is more flexible than the usual notion of group isomorphism.

Definition 4.1 (Freiman isomorphism of order $k$). Two sets $V, V'$ of additive groups $G, G'$ (not necessarily torsion-free) are Freiman isomorphic of order $k$ (in generalized form) if there is an injective map $f$ from $V$ to $V'$ such that $f(v_1) + \dots + f(v_k) = f(v_1') + \dots + f(v_k')$ in $G'$ if and only if $v_1 + \dots + v_k = v_1' + \dots + v_k'$ in $G$.
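For small sets of integers, the definition can be tested by brute force. The sketch below is our own illustration: it checks whether a map into $\mathbf{Z}$ (or into $\mathbf{Z}/p$, when a modulus is supplied) preserves exactly the additive coincidences among $k$-fold sums. The function name and the `modulus` parameter are hypothetical.

```python
from itertools import combinations_with_replacement

def is_freiman_isomorphism(V, f, k, modulus=None):
    """Brute-force check that f is a Freiman isomorphism of order k on the
    finite set of integers V, with target Z (modulus=None) or Z/modulus."""
    reduce_ = (lambda s: s % modulus) if modulus else (lambda s: s)
    elems = sorted(set(V))
    images = {v: reduce_(f(v)) for v in elems}
    if len(set(images.values())) != len(elems):
        return False                                   # f is not injective
    tuples = list(combinations_with_replacement(elems, k))
    for a in tuples:
        for b in tuples:
            same_source = sum(a) == sum(b)
            same_image = (reduce_(sum(images[v] for v in a))
                          == reduce_(sum(images[v] for v in b)))
            if same_source != same_image:
                return False
    return True

V = [0, 1, 5, 12]
print(is_freiman_isomorphism(V, lambda x: x, 2, modulus=101))  # no wrap-around
print(is_freiman_isomorphism(V, lambda x: x, 2, modulus=13))   # 1 + 12 = 0 + 0 mod 13
```

The example shows why the prime must be large relative to $V$: reduction modulo a small prime can create new coincidences among sums even when it is injective on $V$.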

The following theorem allows us to pass from an arbitrary torsion-free group to $\mathbf{Z}$ or cyclic groups of a prime order (see [23, Lemma 5.25]).

Theorem 4.2. Let $V$ be a finite subset of a torsion-free additive group $G$. Then, for any integer $k$, there is a Freiman isomorphism $\phi : V \to \phi(V)$ of order $k$ to some finite subset $\phi(V)$ of the integers $\mathbf{Z}$. The same is true if we replace $\mathbf{Z}$ by $\mathbf{F}_p$, if $p$ is sufficiently large, depending on $V$.

An identical proof to that in [23] implies the following stronger result.

Theorem 4.3. Let $V$ be a finite subset of a torsion-free additive group $G$. Then, for any integer $k$, there is a map $\phi : V \to \phi(V)$ to some finite subset $\phi(V)$ of the integers $\mathbf{Z}$ such that

$$v_1 + \dots + v_i = v_1' + \dots + v_j' \iff \phi(v_1) + \dots + \phi(v_i) = \phi(v_1') + \dots + \phi(v_j') \qquad (9)$$

for all $i, j \le k$. The same is true if we replace $\mathbf{Z}$ by $\mathbf{F}_p$, if $p$ is sufficiently large, depending on $V$.

By Theorem 4.3, a large prime $p$ and a set $V_p \subset \mathbf{F}_p$ exist such that (9) holds for all $i, j \le n$. Hence, we infer that

$$\rho(V) = \rho(V_p).$$

Thus, instead of working with a subset $V$ of a torsion-free group, it is sufficient to work with a subset of $\mathbf{F}_p$, where $p$ is sufficiently large.

To end this section, we record a useful fact about GAPs, as follows. Assume that $A$ is a dense subset of a GAP $Q$. Then, the iterated sumsets $kA$ contain a structure similar to $Q$
(see jHl Lemma 4.4], [151 Lemma B3]). 

Lemma 4.4 (Sárközy-type theorem in progressions). Let $Q = \{a_1 x_1 + \dots + a_r x_r : |x_i| \le M_i, 1 \le i \le r\}$ be a proper GAP in a torsion-free group of rank $r$. Let $A \subset Q$ be a symmetric subset such that $|A| \ge \delta |Q|$ for some $0 < \delta \le 1$. Then, there exist positive integers $1 \le m, l \ll_{\delta,r} 1$ such that $Q_l \subset 2mA$, where $Q_l$ is the GAP

$$Q_l = \{l a_1 x_1 + \dots + l a_r x_r : |x_i| \le M_i / l^2, 1 \le i \le r\}.$$



5. Proof of Theorem 2.1

Embedding. The first step is to embed the problem into the finite field $\mathbf{F}_p$ for some prime $p$. In the case when the $v_i$ are integers, we simply take $p$ to be a large prime (for instance, $p \ge 2^n (\sum_{i=1}^n |v_i| + 1)$ suffices). If $V$ is a subset of a general torsion-free group $G$, one can use Theorem 4.3.



From now on, we can assume that the $v_i$ are elements of $\mathbf{F}_p$ for some large prime $p$. We view elements of $\mathbf{F}_p$ as integers between $0$ and $p - 1$. We use the shorthand $\rho$ to denote $\rho(V)$.

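Fourier Analysis. The main advantage of working in $\mathbf{F}_p$ is that one can use discrete Fourier analysis. Assume that

$$\rho = \rho(V) = \mathbf{P}(S = a),$$

for some $a \in \mathbf{F}_p$. Using the standard notation $e_p(x)$ for $\exp(2\pi\sqrt{-1}\, x/p)$, we have

$$\rho = \mathbf{P}(S = a) = \mathbf{E} \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} e_p(\xi(S - a)) = \mathbf{E} \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} e_p(\xi S) e_p(-\xi a). \qquad (10)$$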

By independence,

$$\mathbf{E} e_p(\xi S) = \prod_{i=1}^n \mathbf{E} e_p(\xi \eta_i v_i) = \prod_{i=1}^n \cos\frac{2\pi \xi v_i}{p}. \qquad (11)$$


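It follows that

$$\rho \le \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} \prod_{i=1}^n \left|\cos\frac{2\pi v_i \xi}{p}\right| = \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} \prod_{i=1}^n \left|\cos\frac{\pi v_i \xi}{p}\right|, \qquad (12)$$

where we made the variable change $\xi \to \xi/2$ (in $\mathbf{F}_p$) to obtain the last identity.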
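The Fourier inversion step can be verified numerically for a small multiset. The sketch below is our illustration with hypothetical function names: it compares the exact law of $S$ modulo $p$, obtained by enumerating sign patterns, with the cosine-product formula, and then evaluates the absolute-value bound obtained after the change of variable (which requires $p$ odd).

```python
import math
from collections import Counter
from itertools import product

def distribution_mod_p(V, p):
    """Exact law of S = sum eta_i v_i reduced mod p, by enumerating all sign patterns."""
    counts = Counter(sum(s * v for s, v in zip(signs, V)) % p
                     for signs in product((-1, 1), repeat=len(V)))
    return {a: c / 2 ** len(V) for a, c in counts.items()}

def prob_by_fourier(V, a, p):
    """(1/p) sum_xi prod_i cos(2 pi xi v_i / p) e_p(-xi a); only the real part survives."""
    total = 0.0
    for xi in range(p):
        prod_cos = math.prod(math.cos(2 * math.pi * xi * v / p) for v in V)
        total += prod_cos * math.cos(2 * math.pi * xi * a / p)
    return total / p

def fourier_upper_bound(V, p):
    """(1/p) sum_xi prod_i |cos(pi xi v_i / p)|, an upper bound for rho(V)."""
    return sum(math.prod(abs(math.cos(math.pi * xi * v / p)) for v in V)
               for xi in range(p)) / p

V, p = [1, 2, 3, 5, 8, 13], 101
law = distribution_mod_p(V, p)
print(max(law.values()), fourier_upper_bound(V, p))
```

Taking absolute values discards the phase $e_p(-\xi a)$, so the upper bound no longer depends on the point $a$ at which the maximum is attained.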

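By convexity, we have that $|\sin \pi z| \ge 2\|z\|$ for any $z \in \mathbf{R}$, where $\|z\| := \|z\|_{\mathbf{R}/\mathbf{Z}}$ is the distance of $z$ to the nearest integer. Thus,

$$\left|\cos\frac{\pi x}{p}\right| \le 1 - \frac{1}{2}\sin^2\frac{\pi x}{p} \le 1 - 2\left\|\frac{x}{p}\right\|^2 \le \exp\left(-2\left\|\frac{x}{p}\right\|^2\right), \qquad (13)$$

where, in the last inequality, we used the fact that $1 - y \le \exp(-y)$ for any $0 \le y \le 1$. Consequently, we obtain the key inequality

$$\rho \le \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} \prod_{i=1}^n \left|\cos\frac{\pi v_i \xi}{p}\right| \le \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} \exp\left(-2\sum_{i=1}^n \left\|\frac{v_i \xi}{p}\right\|^2\right). \qquad (14)$$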

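Large level sets. Now, we consider the level sets $S_m := \{\xi \mid \sum_{i=1}^n \|v_i \xi / p\|^2 \le m\}$. We have

$$n^{-C} \le \rho \le \frac{1}{p} \sum_{\xi \in \mathbf{F}_p} \exp\left(-2\sum_{i=1}^n \left\|\frac{v_i \xi}{p}\right\|^2\right) \le \frac{1}{p} + \frac{1}{p} \sum_{m \ge 1} \exp(-2(m - 1)) |S_m|.$$

Because $\sum_{m \ge 1} \exp(-m) < 1$, there must be a large level set $S_m$ such that

$$|S_m| \exp(-m + 2) \ge \rho p. \qquad (15)$$

In fact, because $\rho \ge n^{-C}$, we can assume that $m = O(\log n)$.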
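Double counting and the triangle inequality. By double-counting, we have

$$\sum_{i=1}^n \sum_{\xi \in S_m} \left\|\frac{v_i \xi}{p}\right\|^2 = \sum_{\xi \in S_m} \sum_{i=1}^n \left\|\frac{v_i \xi}{p}\right\|^2 \le m |S_m|.$$

So, for most $v_i$,

$$\sum_{\xi \in S_m} \left\|\frac{v_i \xi}{p}\right\|^2 \le \frac{C_0 m}{n} |S_m| \qquad (16)$$

for some large constant $C_0$.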



Set $C_0 = \epsilon^{-1}$. By averaging, the set of $v_i$ satisfying (16) has a size of at least $(1 - \epsilon)n$. We call this set $V'$. The set $V \setminus V'$ has a size of at most $\epsilon n$, and this is the exceptional set that appears in Theorem 2.1. In the rest of the proof, we are going to show that $V'$ is a dense subset of a proper GAP.

Because $\|\cdot\|$ is a norm, by the triangle inequality, we have, for any $a \in kV'$,

$$\sum_{\xi \in S_m} \left\|\frac{a \xi}{p}\right\|^2 \le k^2 \frac{C_0 m}{n} |S_m|. \qquad (17)$$

More generally, for any $l \le k$ and $a \in lV'$,

$$\sum_{\xi \in S_m} \left\|\frac{a \xi}{p}\right\|^2 \le k^2 \frac{C_0 m}{n} |S_m|. \qquad (18)$$

Dual sets. Define $S_m^* := \{a \mid \sum_{\xi \in S_m} \|\frac{a \xi}{p}\|^2 \le \frac{1}{200} |S_m|\}$ (the constant 200 is ad hoc, and any sufficiently large constant would be sufficient). $S_m^*$ can be viewed as some sort of a dual set of $S_m$. In fact, one can show that, as far as cardinality is concerned, it does behave like a dual:

$$|S_m^*| \le \frac{8p}{|S_m|}. \qquad (19)$$

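To see this, define $T_a := \sum_{\xi \in S_m} \cos\frac{2\pi a \xi}{p}$. Using the fact that $\cos 2\pi z \ge 1 - 100\|z\|^2$ for any $z \in \mathbf{R}$, we have, for any $a \in S_m^*$,

$$T_a \ge \sum_{\xi \in S_m} \left(1 - 100 \left\|\frac{a \xi}{p}\right\|^2\right) \ge \frac{1}{2} |S_m|.$$

However, using the basic identity $\sum_{a \in \mathbf{F}_p} \cos\frac{2\pi a x}{p} = p \mathbf{1}_{x = 0}$, we have

$$\sum_{a \in \mathbf{F}_p} T_a^2 \le 2p |S_m|.$$

(19) follows from the last two estimates and averaging.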
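The cardinality bound for the dual set can be checked directly on a small example. The sketch below is ours; the function names and the particular choice of $V$, $p$ and $m$ are illustrative assumptions.

```python
def dist(t):
    """Distance from t to the nearest integer."""
    return abs(t - round(t))

def level_set(V, p, m):
    """S_m: frequencies xi in F_p with sum_i ||v_i xi / p||^2 <= m."""
    return [xi for xi in range(p)
            if sum(dist(v * xi / p) ** 2 for v in V) <= m]

def dual_set(S_m, p):
    """S_m^*: elements a in F_p with sum_{xi in S_m} ||a xi / p||^2 <= |S_m| / 200."""
    return [a for a in range(p)
            if sum(dist(a * xi / p) ** 2 for xi in S_m) <= len(S_m) / 200]

V, p, m = [1, 2, 3, 4, 5, 6], 211, 1
S_m = level_set(V, p, m)
S_star = dual_set(S_m, p)
print(len(S_m), len(S_star), 8 * p / len(S_m))
```

Both sets always contain $0$, and the product $|S_m| \cdot |S_m^*|$ stays below $8p$, reflecting the uncertainty-principle flavour of the estimate: a large level set forces a small dual set.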

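Set $k := c_1 \sqrt{n/m}$, for a properly chosen constant $c_1 = c_1(C_0)$. By (18), we have $\bigcup_{l=1}^k lV' \subset S_m^*$. Set $V'' = V' \cup \{0\}$; we have $kV'' \subset S_m^* \cup \{0\}$. This results in the critical bound

$$|kV''| = O\left(\frac{p}{|S_m|}\right) = O(\rho^{-1} \exp(-m + 2)). \qquad (20)$$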

The long range inverse theorem. The role of $\mathbf{F}_p$ is no longer important, so we can view the $v_i$ as integers. The inequality (20) is exactly the assumption of the long range inverse theorem.

With this theorem in hand, we are ready to conclude the proof. A slight technical problem is that $V''$ is not a set but a multiset. Thus, we apply Theorem 3.2 with $X$ as the set of distinct elements of $V''$ (note that $kX = kV''$ if $k \ge 2$). Furthermore, because $k = \Omega(\sqrt{n/m}) = \Omega(\sqrt{n/\log n})$, the quantity $\rho^{-1} \le n^{C}$ is bounded from above by $k^{2C+1}$.

It follows from Theorem 3.2 that $X$ is a subset of a proper symmetric GAP $Q$ of rank $r = O_{C,\epsilon}(1)$ and cardinality

$$O_{C,\epsilon}(k^{-r} |kX|) = O_{C,\epsilon}(k^{-r} |kV''|) = O_{C,\epsilon}\left(\rho^{-1} \exp(-m) \left(\frac{n}{m}\right)^{-r/2}\right) = O_{C,\epsilon}(\rho^{-1} n^{-r/2}),$$

concluding the proof.

Remark 5.1. To prove Theorem 2.5, in the step describing double counting and the triangle inequality, we define $V'$ to be the collection of all $v_i \in V$ satisfying

$$\sum_{\xi \in S_m} \left\|\frac{v_i \xi}{p}\right\|^2 \le \frac{m}{n'} |S_m|.$$

Next, with $k = c_1 \sqrt{n'/m}$ for some sufficiently small $c_1$, we obtain a bound similar to (20), where $|kV''| = O(\rho^{-1} \exp(-m + 2))$. We then conclude Theorem 2.5 by applying the long range inverse theorem.



6. Proof of Theorem 2.9

This proof will essentially follow the same steps as in the discrete case, with some additional 
simple arguments. 

Given a real number $w$ and a variable $z$, we define the $z$-norm of $w$ by

$$\|w\|_z := (\mathbf{E} \|w(z_1 - z_2)\|_{\mathbf{R}/\mathbf{Z}}^2)^{1/2},$$

where $z_1, z_2$ are two iid copies of $z$.
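As a concrete instance, the $z$-norm can be evaluated in closed form when $z$ is Bernoulli. The sketch below is our illustration under that assumption: $z_1 - z_2$ takes the values $-2, 0, 2$ with probabilities $1/4, 1/2, 1/4$, so $\|w\|_z^2 = \frac{1}{2}\|2w\|_{\mathbf{R}/\mathbf{Z}}^2$. The function name is hypothetical.

```python
import math

def dist(t):
    """Distance from t to the nearest integer."""
    return abs(t - round(t))

def z_norm_bernoulli(w):
    """||w||_z for z uniform on {-1, +1}: z1 - z2 is -2, 0, 2 w.p. 1/4, 1/2, 1/4."""
    second_moment = (0.25 * dist(-2 * w) ** 2
                     + 0.5 * dist(0.0) ** 2
                     + 0.25 * dist(2 * w) ** 2)
    return math.sqrt(second_moment)

print(z_norm_bernoulli(0.25) ** 2)   # equals (1/2) * ||0.5||^2
print(z_norm_bernoulli(0.5))         # 2w is an integer, so the norm vanishes
```

In particular, the $z$-norm vanishes on half-integers in the Bernoulli case, which is the continuous counterpart of the change of variable $\xi \to \xi/2$ used in the discrete proof.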

Fourier analysis. Our first step is to obtain the following analogue of (14), using the Fourier transform.

Lemma 6.1 (bounds for small ball probability).

$$\rho_{r,z}(V) \le \exp(\pi r^2) \int_{\mathbf{R}^d} \exp\left(-\sum_{i=1}^n \|\langle v_i, \xi \rangle\|_z^2 / 2 - \pi \|\xi\|_2^2\right) d\xi.$$

This lemma is basically from [21]; the proof is presented in Appendix C for the reader's convenience.

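Next, consider the multiset $V_\beta := \beta^{-1} \cdot V = \{\beta^{-1} v_1, \dots, \beta^{-1} v_n\}$. It is clear that

$$\rho_{1,z}(V_\beta) = \rho_{\beta,z}(V).$$

We now work with $V_\beta$. Thus $\rho_{1,z}(V_\beta) \ge n^{-O(1)}$ and $\sum_{v \in V_\beta} \|v\|_2^2 = \beta^{-2}$.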



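For concision, we write $\rho$ for $\rho_{1,z}(V_\beta)$. Set $M := 2A \log n$, where $A$ is sufficiently large. From Lemma 6.1 and the fact that $\rho \ge n^{-O(1)}$, we easily obtain

$$\int_{\|\xi\|_2 \le \sqrt{M}} \exp\left(-\frac{1}{2} \sum_{v \in V_\beta} \|\langle v, \xi \rangle\|_z^2 - \pi \|\xi\|_2^2\right) d\xi \ge \frac{\rho}{2}. \qquad (21)$$

Large level sets. For each integer $0 \le m \le M$, we define the level set

$$S_m := \left\{\xi \in \mathbf{R}^d : \sum_{v \in V_\beta} \|\langle v, \xi \rangle\|_z^2 + \|\xi\|_2^2 \le m\right\}.$$

Then, it follows from (21) that $\sum_{m \le M} \mu(S_m) \exp(-\frac{m}{2} + 1) \ge \rho$, where $\mu(\cdot)$ denotes the Lebesgue measure of a measurable set. Hence, there exists $m \le M$ such that $\mu(S_m) \ge \rho \exp(\frac{m}{4} - 2)$.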

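Next, because $S_m \subset B(0, \sqrt{m})$, by the pigeon-hole principle there exists a ball $B(x, \frac{1}{2}) \subset B(0, \sqrt{m})$ such that

$$\mu(B(x, \tfrac{1}{2}) \cap S_m) \ge c_d\, \mu(S_m) m^{-d/2} \ge c_d\, \rho \exp(\tfrac{m}{4} - 2) m^{-d/2}.$$

Consider $\xi_1, \xi_2 \in B(x, 1/2) \cap S_m$. By the Cauchy-Schwarz inequality (note that $\|\cdot\|_z$ is a norm), we have

$$\sum_{v \in V_\beta} \|\langle v, \xi_1 - \xi_2 \rangle\|_z^2 \le 4m.$$

Because $\xi_1 - \xi_2 \in B(0, 1)$ and $\mu(B(x, \frac{1}{2}) \cap S_m - B(x, \frac{1}{2}) \cap S_m) \ge \mu(B(x, \frac{1}{2}) \cap S_m)$, if we put

$$T := \left\{\xi \in B(0, 1) : \sum_{v \in V_\beta} \|\langle \xi, v \rangle\|_z^2 \le 4m\right\},$$

then

$$\mu(T) \ge c_d\, \rho \exp(\tfrac{m}{4} - 2) m^{-d/2}.$$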

Discretization. Choose $N$ to be a sufficiently large prime (depending on the set $T$). Define the discrete box

$$B_0 := \{(k_1/N, \dots, k_d/N) : k_i \in \mathbb{Z},\ -N \le k_i \le N\}.$$

We consider all shifted boxes $x + B_0$, where $x \in [0,1/N]^d$. By the pigeon-hole principle, there exists $x_0$ such that the size of the discrete set $(x_0 + B_0) \cap T$ is at least the expectation: $|(x_0 + B_0) \cap T| \ge N^d\mu(T)$ (to see this, we first consider the case when $T$ is a box).
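The averaging behind this pigeon-hole step can be illustrated numerically in dimension one. The following is only a sketch under assumptions that are not in the text: $d = 1$, $N = 7$, and a hypothetical set $T$ given as a union of two intervals inside $B(0,1)$. It samples shifts $x \in [0, 1/N]$ and compares the average count of grid points in $T$ with $N\mu(T)$.

```python
def count_in_T(shift, N, intervals):
    """Number of points shift + k/N, -N <= k <= N, lying in the union of intervals."""
    count = 0
    for k in range(-N, N + 1):
        point = shift + k / N
        if any(lo <= point <= hi for lo, hi in intervals):
            count += 1
    return count


def shifted_counts(N, intervals, samples=2000):
    """Counts for shifts taken at midpoints of a uniform partition of [0, 1/N]."""
    return [
        count_in_T((j + 0.5) / (N * samples), N, intervals)
        for j in range(samples)
    ]


# Hypothetical example data (assumptions made for illustration only).
N = 7
T = [(-0.7, -0.6), (0.1, 0.45)]
measure = sum(hi - lo for lo, hi in T)

counts = shifted_counts(N, T)
average = sum(counts) / len(counts)  # approximates N * mu(T)
best = max(counts)                   # some shift does at least as well as the average
```

Since the maximum of the counts is never below their average, some shift $x_0$ attains at least the expected count, which is the form in which the principle is used above.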

Let us fix some $\xi_0 \in (x_0 + B_0) \cap T$. Then, for any $\xi \in (x_0 + B_0) \cap T$, we have

$$\sum_{v\in V_\beta} \|\langle v, \xi_0-\xi\rangle\|_z^2 \le 2\Big(\sum_{v\in V_\beta}\|\langle v,\xi\rangle\|_z^2 + \sum_{v\in V_\beta}\|\langle v,\xi_0\rangle\|_z^2\Big) \le 16m.$$

Note that $\xi_0 - \xi \in B_1 := B_0 - B_0 = \{(k_1/N, \dots, k_d/N) : k_i \in \mathbb{Z},\ -2N \le k_i \le 2N\}$. Thus, there exists a subset $S$ of $B_1$ of size at least $c_d N^d \rho\exp(\frac{m}{4}-2)m^{-d/2}$ such that the following holds for any $s \in S$:

$$\sum_{v\in V_\beta} \|\langle v,s\rangle\|_z^2 \le 16m.$$

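Double counting. We let $y = z_1 - z_2$, where $z_1, z_2$ are iid copies of $z$. By the definition of $S$, we have

$$\sum_{s\in S}\sum_{v\in V_\beta} \|\langle v,s\rangle\|_z^2 \le 16m|S|,$$

that is,

$$\mathbf{E}_y \sum_{s\in S}\sum_{v\in V_\beta} \|y\langle v,s\rangle\|_{\mathbb{R}/\mathbb{Z}}^2 \le 16m|S|.$$

It is then implied that there exists $1 \le |y_0| \le C_z$ such that

$$\sum_{s\in S}\sum_{v\in V_\beta} \|y_0\langle v,s\rangle\|_{\mathbb{R}/\mathbb{Z}}^2 \le 16m|S|\,\mathbf{P}(1 \le |y| \le C_z)^{-1}.$$

However, by property (7), we have $\mathbf{P}(1 \le |y| \le C_z) \ge 1/2$. Thus,

$$\sum_{s\in S}\sum_{v\in V_\beta} \|y_0\langle v,s\rangle\|_{\mathbb{R}/\mathbb{Z}}^2 \le 32m|S|.$$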

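Let $n'$ be any number between $n^{\epsilon}$ and $n$. We say that $v \in V_\beta$ is bad if

$$\sum_{s\in S} \|y_0\langle v,s\rangle\|_{\mathbb{R}/\mathbb{Z}}^2 \ge \frac{32m|S|}{n'}.$$

Then, the number of bad vectors is at most $n'$. Let $V_\beta'$ be the set of remaining vectors. Thus, $V_\beta'$ contains at least $n - n'$ elements. In the remainder of the proof, we show that $V_\beta'$ is close to a GAP, as claimed in the theorem.

Dual sets. Consider an arbitrary $v \in V_\beta'$. We have $\sum_{s\in S} \|y_0\langle s,v\rangle\|_{\mathbb{R}/\mathbb{Z}}^2 \le 32m|S|/n'$.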






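Set $k := \sqrt{n'/(128\pi^2 m)}$, and let $V_\beta'' := k(V_\beta' \cup \{0\})$. By the Cauchy-Schwarz inequality (see (18)), for any $a \in V_\beta''$, we have

$$\sum_{s\in S} 2\pi^2 \|\langle s, y_0 a\rangle\|_{\mathbb{R}/\mathbb{Z}}^2 \le \frac{|S|}{2},$$

which implies

$$\sum_{s\in S} \cos(2\pi\langle s, y_0 a\rangle) \ge \frac{|S|}{2}.$$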
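Observe that, for any $x \in C(0,\frac{1}{256d})$ (the ball of radius $1/256d$ in the $\|\cdot\|_\infty$ norm) and any $s \in S \subset C(0,2)$, we always have $\cos(2\pi\langle s,x\rangle) \ge 1/2$ and $\sin(2\pi\langle s,x\rangle) \le 1/12$. Thus, for any $x \in C(0,\frac{1}{256d})$,

$$\sum_{s\in S} \cos\big(2\pi\langle s, y_0 a + x\rangle\big) \ge \frac{|S|}{4} - \frac{|S|}{12} = \frac{|S|}{6}.$$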
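The numerical constants in this step can be sanity-checked directly. The sketch below is illustrative only and assumes the standard inequality $\cos(2\pi t) \ge 1 - 2\pi^2\|t\|_{\mathbb{R}/\mathbb{Z}}^2$ as the link between the squared-norm sum and the cosine sum; the helper names are hypothetical. It checks that inequality on a grid, the worst-case phase $|\langle s,x\rangle| \le 1/128$ for $\|s\|_\infty \le 2$ and $\|x\|_\infty \le 1/256d$, and the arithmetic $\frac12\cdot\frac12 - \frac1{12} = \frac16$.

```python
import math
from fractions import Fraction


def dist_to_nearest_integer(t):
    """The norm ||t||_{R/Z}: distance from t to the nearest integer."""
    return abs(t - round(t))


def cosine_lower_bound_holds(t, tol=1e-12):
    """Check cos(2*pi*t) >= 1 - 2*pi^2*||t||^2 up to rounding tolerance."""
    bound = 1 - 2 * math.pi ** 2 * dist_to_nearest_integer(t) ** 2
    return math.cos(2 * math.pi * t) >= bound - tol


def worst_case_phase(d):
    """Largest possible |<s, x>| when ||s||_inf <= 2 and ||x||_inf <= 1/(256 d)."""
    return d * 2 * (1 / (256 * d))


grid_ok = all(cosine_lower_bound_holds(j / 1000) for j in range(-3000, 3001))
theta = 2 * math.pi * worst_case_phase(3)  # d = 3 is an arbitrary choice
combined = Fraction(1, 2) * Fraction(1, 2) - Fraction(1, 12)
```

The worst-case phase does not depend on $d$, which is why the radius $1/256d$ is scaled by the dimension.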



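However,

$$\int_{x\in[0,N]^d} \Big(\sum_{s\in S}\cos(2\pi\langle s,x\rangle)\Big)^2 dx \le \sum_{s_1,s_2\in S}\int_{x\in[0,N]^d} \exp\big(2\pi\sqrt{-1}\langle s_1-s_2,x\rangle\big)\,dx \ll_d |S|N^d.$$

Hence, we deduce the following:

$$\mu_{x\in[0,N]^d}\Big(\Big(\sum_{s\in S}\cos(2\pi\langle s,x\rangle)\Big)^2 \ge \Big(\frac{|S|}{6}\Big)^2\Big) \ll_d \frac{|S|N^d}{(|S|/6)^2}.$$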

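Now, using the facts that $S$ is large, $|S| \gg_d N^d\rho\exp(\frac{m}{4}-2)m^{-d/2}$, and that $N$ was chosen to be large enough for $y_0V_\beta'' + C(0,\frac{1}{256d}) \subset [0,N]^d$, we have

$$\mu\big(y_0V_\beta'' + C(0,\tfrac{1}{256d})\big) \ll_d \rho^{-1}\exp(-\tfrac{m}{4}+2)\,m^{d/2}.$$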



Thus, we obtain the following analogue of (20):

$$\mu\Big(k(V_\beta' \cup \{0\}) + C\big(0,\tfrac{1}{256dy_0}\big)\Big) \ll_d \rho^{-1}\exp(-\tfrac{m}{4}+2)\,m^{d/2}. \quad (22)$$



The long range inverse theorem. Our analysis again relies on the long range inverse theorem.

Let $D := 1024dy_0$. We approximate each vector $v'$ of $V_\beta'$ by its closest vector in $(\frac{1}{Dk}\mathbb{Z})^d$:

$$\Big\|v' - \frac{a}{Dk}\Big\|_2 \le \frac{\sqrt{d}}{Dk}, \quad \text{with } a \in \mathbb{Z}^d.$$



Let $A_\beta$ be the collection of all such $a$. Because $\sum_{v'\in V_\beta'}\|v'\|_2^2 = O(\beta^{-2})$, we have

$$\sum_{a\in A_\beta}\|a\|_2^2 = O_{d,C_z}(k^2\beta^{-2}). \quad (23)$$



It follows from (22) that

$$|k(A_\beta + C_0(0,1))| = O_{d,C_z}\big(\rho^{-1}(Dk)^d\exp(-\tfrac{m}{4}+2)m^{d/2}\big) = O_{d,C_z}\big(\rho^{-1}k^d\exp(-\tfrac{m}{4}+2)m^{d/2}\big),$$

where $C_0(0,r)$ is the discrete cube $\{(z_1,\dots,z_d) \in \mathbb{Z}^d : |z_i| \le r\}$.

Now, we apply Theorem 3.2 to the set $A_\beta + C_0(0,1)$ (note that $0 \in A_\beta$). The theorem implies that there exists a proper GAP $P = \{\sum_{i=1}^r x_ig_i : |x_i| \le N_i\} \subset \mathbb{Z}^d$ containing $A_\beta + C_0(0,1)$ with a small rank $r = O(1)$ and small size

$$|P| = O_{d,C_z}\big(\rho^{-1}k^d\exp(-\tfrac{m}{4}+2)m^{d/2}k^{-r}\big) = O_{d,C_z}\big(\rho^{-1}n'^{(d-r)/2}\big).$$

Moreover, we learned from the proof of Theorem 3.2 and Lemma 4.4 that $kP$ can be contained in a set $ck(A_\beta + C_0(0,1))$ for some $c = O(1)$. Using (23), we conclude that all generators $g_i$ of $P$ are bounded:

$$\|g_i\|_2 = O_{d,C_z}(k\beta^{-1}).$$



Next, because $C_0(0,1) \subset P$, the rank $r$ of $P$ is at least $d$. It is a routine calculation to see that $Q := \frac{\beta}{Dk}\cdot P$ satisfies all of the required properties in Theorem 2.9.

Appendix A. Proof of the long range inverse theorem



The key lemma to prove our long range inverse theorem is an earlier result by Tao and the second author ([19, Theorem 1.21]), given below.

Lemma A.1. Let $\epsilon > 0$, $\gamma > 0$ be constants. Assume that $X$ is a subset of integers such that $|kX| \le k^{\gamma}|X|$ for some number $k \ge 2$. Then, $kX$ is contained in a symmetric 2-proper GAP $Q$ with rank $r = O_{\gamma,\epsilon}(1)$ and cardinality $O_{\gamma,\epsilon}(|kX|)$.

Next, if $kX \subset kQ$, where $Q$ is a GAP, then it is natural to suspect that $X \subset Q$, but this is not always true. However, the conclusion holds if $kQ$ is 2-proper and $0 \in X$.

Lemma A.2 (Dividing sumsets relations). Assume that $0 \in X$ and that $P = \{\sum_{i=1}^r x_ia_i : |x_i| \le N_i\}$ is a symmetric 2-proper GAP that contains $kX$. Then $X \subset \{\sum_{i=1}^r x_ia_i : |x_i| \le 2N_i/k\}$.

A good way to keep this lemma in mind is the following. Consider the relation X C P. It is 
trivial that this relation can always be multiplied, namely, for all integers k > 1, kX C kP. 
The above lemma asserts that, under certain assumptions, the relation kX C kP can be 
divided, giving $X \subset P$.
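A small brute-force example may help make the multiplied and divided relations concrete. This is a sketch on a hypothetical rank-one instance ($X = \{0,1,2\}$, $k = 4$, and the GAP with generator $1$ and bound $N_1 = 8$), chosen only for illustration; the helper names are not from the paper.

```python
from itertools import product


def k_fold_sumset(X, k):
    """kX = {x_1 + ... + x_k : x_i in X}, computed by repeated addition."""
    total = {0}
    for _ in range(k):
        total = {t + x for t in total for x in X}
    return total


def gap(generators, bounds):
    """The symmetric GAP {sum c_i g_i : |c_i| <= N_i} as a set of integers."""
    ranges = [range(-N, N + 1) for N in bounds]
    return {
        sum(c * g for c, g in zip(coeffs, generators))
        for coeffs in product(*ranges)
    }


def is_t_proper(generators, bounds, t):
    """True if all sums with |c_i| <= t * N_i are pairwise distinct."""
    expected = 1
    for N in bounds:
        expected *= 2 * t * N + 1
    return len(gap(generators, [t * N for N in bounds])) == expected


# Hypothetical instance (an assumption made for illustration).
X = {0, 1, 2}
k = 4
P_generators, P_bounds = [1], [8]

kX = k_fold_sumset(X, k)
P = gap(P_generators, P_bounds)
divided = gap(P_generators, [2 * N // k for N in P_bounds])  # bounds 2 N_i / k
```

Here $kX = \{0,\dots,8\} \subset P$, the GAP $P$ is 2-proper, and $X$ indeed lies in the GAP with the divided bounds $2N_i/k = 4$.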

Proof (of Lemma A.2). Without loss of generality, we can assume that $k = 2^l$. It is sufficient to show that $2^{l-1}X \subset \{\sum_{i=1}^r x_ia_i : |x_i| \le N_i/2\}$. Because $0 \in X$, we have $2^{l-1}X \subset 2^lX \subset P$, so any element $x$ of $2^{l-1}X$ can be written as $x = \sum_{i=1}^r x_ia_i$, with $|x_i| \le N_i$. Now, because $2x \in P \subset 2P$ and $2P$ is proper (as $P$ is 2-proper), we must have $0 \le |2x_i| \le N_i$. □

It is clear that Theorem 3.2 follows from Lemma A.1 and Lemma A.2.



Appendix B. Remarks on Theorem 2.9



The purpose of this section is to give an example showing that the bound in Theorem 2.9 cannot be improved and to provide a proof for Corollary 2.10.

First, consider the set $U := [-2n,-n] \cup [n,2n]$. Sample $n$ points $v_1,\dots,v_n$ from $U$ independently with respect to the (continuous) uniform distribution, and let $A$ be the set of sampled points. Let $\xi$ be the Gaussian random variable $N(0,1)$, and consider the sum

$$S := v_1\xi_1 + \dots + v_n\xi_n,$$

where $\xi_i$ are iid copies of $\xi$.

With probability one, $S$ has a Gaussian distribution with mean $0$ and variance $\Theta(n^3)$. Thus, for some interval $I$ of length 1, $\mathbf{P}(S \in I) \ge Cn^{-3/2}$, for some constant $C$.
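The orders of magnitude in this example can be checked with a short computation. The sketch below makes choices that are not in the text: $n = 1000$, a fixed random seed, and the interval $I = [-1/2, 1/2]$. Since each $v_i^2 \in [n^2, 4n^2]$, the variance $\sum v_i^2$ lies in $[n^3, 4n^3]$, and the probability of the central unit interval is computed exactly via the error function.

```python
import math
import random


def sample_steps(n, rng):
    """n points drawn uniformly from [-2n, -n] U [n, 2n]."""
    return [rng.choice((-1, 1)) * rng.uniform(n, 2 * n) for _ in range(n)]


def central_interval_probability(steps):
    """P(|S| <= 1/2) for S = sum v_i * xi_i with iid standard Gaussian xi_i."""
    sigma = math.sqrt(sum(v * v for v in steps))
    return math.erf(0.5 / (sigma * math.sqrt(2.0)))


rng = random.Random(0)  # fixed seed, for reproducibility only
n = 1000                # illustrative size
steps = sample_steps(n, rng)
variance = sum(v * v for v in steps)
scaled = central_interval_probability(steps) * n ** 1.5
```

The quantity `scaled` stays between roughly $1/(2\sqrt{2\pi})$ and $1/\sqrt{2\pi}$ for any sample, which is the content of the bound $\mathbf{P}(S \in I) \ge Cn^{-3/2}$.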

Set $n' = \delta n$, for some small positive constant $\delta$. Theorem 2.9 states that (most of) $A$ is
Q^_2S3y close to a GAP of rank r and volume 0{n^~^). We show that one cannot replace 
this bound by $O(n^{2-r/2-\epsilon})$ for any $\epsilon > 0$. There are only three possible values for $r$: $r = 1, 2, 3$. Our claim follows from the following simple lemma, whose proof is left as an exercise.

Lemma B.1. Let $C, \delta, \epsilon$ be positive constants and $n \to \infty$. The following hold with probability $1 - o(1)$ (with respect to the random choice of $A$).

• A does not contain any subset of cardinality (1 — 6)n that is ^^^-close to a GAP 
of rank 1 and volume of at most $Cn^{3/2-\epsilon}$.

• A does not contain any subset of cardinality (1 — 6)n that is '^^" -c/ose to a GAP 
of rank 2 and volume of at most $Cn^{1-\epsilon}$.

• A does not contain any subset of cardinality (1 — d)n that is '""^" -c/ose to a GAP 
of rank 3 and volume of at most $Cn^{1/2-\epsilon}$.

The construction above can also be generalized to higher dimensions, but we do not attempt 
to do so here. 

For the remainder of this section, we prove Corollary 2.10.
We consider the following two cases. 

Case 1: $r \ge d+1$. Consider the GAP $P$ at the end of the proof of Theorem 2.9. Recall that $|P| = O_{d,C_z}(\rho^{-1}n'^{(d-r)/2}) \le O_{d,C_z}(\rho^{-1}/\sqrt{n'})$. Let



It is clear that $Q$ satisfies all of the conditions of Corollary 2.10. (Note that, in this case,
we obtain a stronger approximation; almost all elements of Y are 0(^^=^)-close to Q.) 

Case 2: $r = d$. Because the unit vectors $e_j = (0,\dots,1,\dots,0)$ belong to $P = \{\sum_{i=1}^d x_ig_i : |x_i| \le N_i\} \subset \mathbb{Z}^d$, the set of generators $g_i$, $i = 1,\dots,d$, forms a basis of $\mathbb{R}^d$ with unit determinant. In $P$, consider the set of lattice points with all coordinates divisible by $k$. We observe that (for instance, by [23, Theorem 3.36]) this set can be contained in a GAP $P'$ of rank $d$ and cardinality $\max(O(k^{-d}|P|), 1) = \max(O(\rho^{-1}/n'^{d/2}), 1)$. (Here, we use the bound $|P| = O(\rho^{-1}\exp(-\frac{m}{4})m^{d/2})$.) Next, define



^ Dk 



It is easy to verify that $Q$ satisfies all of the conditions of Corollary 2.10. (Note that, in this case, we obtain a stronger bound on the size of $Q$.)

Appendix C. Proof of Lemma 6.1



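We have

$$\mathbf{P}\Big(\sum_{i=1}^n z_iv_i \in B(x,r)\Big) = \mathbf{P}\Big(\Big\|\sum_{i=1}^n z_iv_i - x\Big\|_2^2 \le r^2\Big)$$

$$= \mathbf{P}\Big(\exp\Big(-\pi\Big\|\sum_{i=1}^n z_iv_i - x\Big\|_2^2\Big) \ge \exp(-\pi r^2)\Big)$$

$$\le \exp(\pi r^2)\,\mathbf{E}\exp\Big(-\pi\Big\|\sum_{i=1}^n z_iv_i - x\Big\|_2^2\Big).$$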



Note that

$$\exp(-\pi\|x\|_2^2) = \int_{\mathbb{R}^d} e(\langle x,\xi\rangle)\exp(-\pi\|\xi\|_2^2)\,d\xi.$$
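This Fourier identity for the Gaussian can be verified numerically in dimension one. The sketch below assumes the convention $e(t) := \exp(2\pi\sqrt{-1}\,t)$ and uses a plain trapezoid rule on a truncated interval; the truncation width and step size are arbitrary choices. The sine part of $e(x\xi)$ is odd in $\xi$ and integrates to zero, so only the cosine part is summed.

```python
import math


def gaussian_fourier_integral(x, half_width=6.0, step=1e-3):
    """Trapezoid approximation of the integral over [-half_width, half_width]
    of cos(2*pi*x*xi) * exp(-pi*xi^2) d(xi)."""
    count = int(round(2 * half_width / step))
    total = 0.0
    for j in range(count + 1):
        xi = -half_width + j * step
        weight = 0.5 if j in (0, count) else 1.0  # endpoint weights
        total += weight * math.cos(2 * math.pi * x * xi) * math.exp(-math.pi * xi * xi)
    return total * step


def gaussian(x):
    """The claimed value exp(-pi * x^2)."""
    return math.exp(-math.pi * x * x)


errors = [abs(gaussian_fourier_integral(x) - gaussian(x)) for x in (0.0, 0.5, 1.3)]
```

In higher dimensions the identity factors over coordinates, so the one-dimensional check covers the general case.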



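We thus have

$$\mathbf{P}\Big(\sum_{i=1}^n z_iv_i \in B(x,r)\Big) \le \exp(\pi r^2)\int_{\mathbb{R}^d} \mathbf{E}\,e\Big(\Big\langle\sum_{i=1}^n z_iv_i,\xi\Big\rangle\Big)\,e(-\langle x,\xi\rangle)\exp(-\pi\|\xi\|_2^2)\,d\xi.$$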



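Using

$$\Big|\mathbf{E}\,e\Big(\Big\langle\sum_{i=1}^n z_iv_i,\xi\Big\rangle\Big)\Big| = \prod_{i=1}^n |\mathbf{E}\,e(z_i\langle v_i,\xi\rangle)|$$

and

$$|\mathbf{E}\,e(z_i\langle v_i,\xi\rangle)| \le |\mathbf{E}\,e(z_i\langle v_i,\xi\rangle)|^2/2 + 1/2 \le \exp(-\|\langle v_i,\xi\rangle\|_z^2/2),$$

we obtain

$$\rho_{r,z}(V) \le \exp(\pi r^2)\int_{\mathbb{R}^d} \exp\Big(-\sum_{i=1}^n \|\langle v_i,\xi\rangle\|_z^2/2 - \pi\|\xi\|_2^2\Big)\,d\xi.$$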